Abstract 

Introduction. The 2020s were marked by the emergence of a new generation of computer simulators using augmented 
reality. One of the promising advantages of augmented reality technology is the ability to safely simulate hazardous 
real-world situations. A prerequisite for realizing this advantage is the visual coherence of augmented reality 
scenes: virtual objects must be indistinguishable from real ones. All IT leaders consider augmented reality the next 
“big wave”; thus, visual coherence is becoming a key issue for IT in general. However, it is in aerospace 
applications that visual coherence has already acquired practical significance. An example is Boeing's development 
of an augmented reality flight simulator, which began in 2022. Visual coherence is a complex problem, one 
aspect of which is providing the correct overall coloration of virtual objects in an augmented reality scene. The 
objective of the research was to develop a new method of such tinting. 

Materials and Methods. The developed method (called spectral transplantation) uses two-dimensional spectral image 
transformations. 

Results. A spectral transplantation technology is proposed that provides direct transfer of color, brightness, and contrast 
characteristics from the real background to virtual objects. An algorithm for automatic selection of the optimal type of 
spectral transformation has been developed. 

Discussion and Conclusion. As a fully automatic process that requires no recording of lighting conditions, spectral 
transplantation solves a number of complex problems of visual coherence. Spectral transplantation can be a valuable 
addition to other methods of providing visual coherence. 
Keywords: computer simulators, augmented reality, visual coherence 

Acknowledgements: the author thanks A. Terenzi (Inglobe Technologies Srl, Ceccano, Italy) for support in developing 
the software. 

For citation. Gorbunov AL. Visual Coherence in Augmented Reality. Advanced Engineering 
Research (Rostov-on-Don). 2023;23(2):180-190. https://doi.org/10.23947/2687-1653-2023-23-2-180-190 


Introduction. Modern simulators use virtual reality (VR) practically by default. The advantages of this 
approach are well known, so we will not dwell on them; instead, we note a number of significant and, more 
importantly, insurmountable disadvantages that stem from the very nature of VR technology. VR is a digital, discrete 
technology, while the real world is continuous. Therefore, modeling the real world in VR inevitably involves 
errors, which reduces the efficiency of training. For training systems, however, an even more serious drawback 
is that human decisions are largely based on subconscious consideration of numerous details of the real picture of the 
world. This process is fundamentally impossible to reproduce using purely computer technologies (e.g., VR) for two 
reasons. First, we still do not know (and are unlikely ever to know) how the human brain works; the latest 
speculations on the topic of artificial intelligence only confirm this. Second, the details of the real world taken into 
account when making decisions are almost infinite in number, arise randomly, and differ in nature (visual, 
acoustic, tactile, etc.). 
The emergence of augmented reality (AR) training systems in the 2020s has made this problem less severe. 
Examples are Boeing's development of an augmented reality pilot simulator based on the well-known 
R6 ATARS project, which began in the fall of 2022, a similar project launched by the British company BAE Systems, and 
the air traffic control training simulator described in this article. In AR, all the information wealth of the surrounding 
world is presented explicitly and does not require modeling. However, to realize the advantages of AR associated with 
the parallel presence of real and virtual objects in scenes, the problem of visual coherence (VC) must be solved: virtual 
objects must be indistinguishable from real ones. This article proposes a method for solving the VC problem 
within a project to develop a training system for air traffic controllers. 

AR is a derivative form of VR. AR retains all the features of VR, but, in addition, as a hybrid technology, it has 
significant advantages arising from the parallel coexistence of virtual and real objects, which attracts the attention of 
developers to VC. Moreover, studies [1] show that among the negative psychophysiological consequences of using 
augmented reality devices, optical discomfort dominates, which occurs due to the difference in perception of real and 
virtual objects in the same scene due to the absence of VC. IT industry leaders see AR as the next “big wave” of 
revolutionary changes in digital electronics. Therefore, the VC problem is becoming a key one for IT as a whole, and 
these leaders show a growing interest in methods of solving it [2]. However, the problem of visual coherence has 
already acquired practical significance in aerospace applications. We encountered the VC problem when 
developing a training system for air traffic controllers: the rapid increase in the intensity of air traffic at airports has 
led to more frequent collisions of aircraft with other aircraft and with airfield vehicles during ground 
maneuvering (>50 cases worldwide in 2018, before the outbreak of the pandemic). Air traffic controllers working in 
airport towers are not always ready to respond adequately to such emergencies, which calls for additional 
training. The most effective form of such training involves presenting the controller with a situation of hazardous 
proximity of objects on the airfield, which is impossible with real objects but can be implemented absolutely safely 
in augmented reality scenes. In our application, emergency situations were safely simulated using AR at a real airfield, 
and the virtual aircraft used had to be indistinguishable from real ones. 

An exhaustive overview of the known VC methods can be found in [3]. According to the classification given there, all 
VC methods can be divided into two main classes: those that measure lighting parameters and those that 
estimate lighting conditions. In the first case, a mandatory procedure is a preliminary measurement of illumination 
conditions, carried out with the help of special equipment. This procedure is long and labor-intensive, and it 
is impossible when a prerecorded image or video of the real world is used. In the second case, the complexity 
of reconstructing the lighting pattern from images forces assumptions and limitations, which makes the results 
ambiguous. Therefore, despite the impressive results obtained by researchers using the methods mentioned in the 
review [3], the VC level is still often insufficient, specifically in AR scenes with real natural landscapes under ambient 
lighting conditions, which are typical for aviation applications. As the review of publications below shows, there is a 
shortage of research of this kind. 

This work was aimed at developing a universal and automatic method that provides direct transfer of color, brightness 
and contrast characteristics from a real background to virtual objects without the digital 3D modeling required 
in existing VC approaches. The method is based on the mathematical apparatus of two-dimensional spectral 
transformations; we call it “spectral transplantation”. 

The key results of this study are: 

— basic scheme for the spectral transplantation method, which provides a direct transfer of color, brightness and 
contrast characteristics from the real background to virtual objects. The method involves replacing a part of the 
spectrum of the image of the virtual world with the same part of the spectrum of the image of the real world, followed 
by an inverse transformation of the spectrum with the transplanted part; 

— algorithm for automatic selection of the optimal type of spectral transformation for use in spectral transplantation. 

It is important to note that VC depends on many factors: lighting, shadows, color tone, mutual reflections, surface 
texture, optical aberrations, convergence, accommodation, etc. Accordingly, various AR visualization techniques are 
used. In our case, VC is provided only for the general illumination and coloring of virtual objects in AR, 
which is one of the VC challenges, especially for outdoor scenes. Therefore, spectral transplantation should be used in 
combination with other VC methods to achieve full VC. 
The list of sources in [3] includes 175 items; that review covers almost all approaches to achieving VC 
(with the exception of the latest ones, based on neural networks, discussed below). Therefore, here we briefly describe 
only some characteristic examples that correspond to the basic classes mentioned above. 

Measurement of lighting conditions 

Using a light probe with diffuse bands between mirror spherical quadrants, P. Debevec and others [4] demonstrated 
how the full dynamic color range of a scene could be reconstructed from a single exposure. Based on the image obtained 
with the probe, the intensity of several light sources could be estimated by solving a simple linear system of equations. 
The results were used to render a virtual diffuse sphere. 

A. Alhakamy and M. Tuceryan [5] estimated the direction of incident light (direct illumination) of a real scene 
using computer vision techniques with a 360° camera attached to an AR device. The system simulated the light 
reflected from surfaces when rendering virtual objects. Then, the shadow parameters for each virtual object were 
determined. 

Assessment of lighting conditions 

S.B. Knorr and D. Kurz [6] proposed a scheme for estimating real-world lighting conditions from a photo 
of a human face. The method was based on training a face model on a database of faces with 
known lighting. The authors then reconstructed the most plausible real-world lighting conditions for the captured 
face in the basis of spherical harmonics. 

We should mention work [7], which described a combination of measurement and evaluation of illumination. The 
authors measured the reflective properties of real objects using depth maps and color images of a rotating object on a 
turntable using an RGB-D camera. The shape of the object was reconstructed through integrating images of the depth 
of the object obtained from different viewpoints. The reflectivity of an object was determined by evaluating the 
parameters of the reflection model from reconstructed images of shape and color. 

The closest analogues of the proposed method are approaches that, like spectral transplantation, do not involve 
preliminary measurements of lighting and simulation of lighting conditions, scene geometry, surface reflection, and 
also provide for automatic processing. 

Among such analogues are methods of color transfer from image to image. Paper [8] presented a method for 
automatically transferring color statistics (means and standard deviations) from the reference image to the target image. 
Additional parameters were used to avoid the manual processing otherwise required to adjust the color 
transfer in cases where the images differed strongly in color palette. These additional parameters combined 
the variances of the reference and target images. The authors of the article noted that manual modification 
of these parameters, although extremely rare, was nevertheless sometimes necessary. In addition, the statistical nature of 
the method raised questions about the type and scope of statistics. Also, the ability of the method to process certain 
types of images (containing shiny objects, shadows) was not obvious. 
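
The basic idea of statistics-based color transfer can be illustrated with a minimal sketch. It is not the algorithm of [8], which adds the extra parameters described above; it only matches the per-channel mean and standard deviation of a target image to those of a reference image, directly in RGB for simplicity. The function name `transfer_color_statistics` and the `eps` guard are assumptions introduced for this illustration.

```python
import numpy as np


def transfer_color_statistics(target, reference, eps=1e-8):
    """Match per-channel mean and standard deviation of target to reference.

    Both inputs are H x W x 3 arrays. This is a simplified illustration,
    not the method of [8].
    """
    target = np.asarray(target, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    t_mean, t_std = target.mean(axis=(0, 1)), target.std(axis=(0, 1))
    r_mean, r_std = reference.mean(axis=(0, 1)), reference.std(axis=(0, 1))
    # Remove the target statistics, then impose the reference statistics.
    # eps (an assumed guard) avoids division by zero for flat channels.
    return (target - t_mean) * (r_std / (t_std + eps)) + r_mean
```

The result is not clipped to the displayable range, so a caller would typically clip it before converting back to 8-bit values.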

Xuezhong Xiao and Lizhuang Ma [9] presented an algorithm to solve the problem of color transfer fidelity 
in terms of scene details and colors. The authors considered the preservation of the color gradient as a necessary 
condition for the authenticity of the scene. They formulated the problem of color transfer as an optimization problem 
and solved it in two stages — histogram matching and gradient-preserving optimization. A metric was proposed for 
an objective assessment of the efficiency of color transfer algorithms based on examples. 

The advantages of the developed method, in comparison to [8, 9] and their numerous analogues, are its versatility, 
fully automatic nature, and the ability to transfer not only color, but also all the main characteristics of the image using 
one simple procedure. 

The proposed method uses two-dimensional spectral transformations. Different types of images are optimally 
described by different types of spectral transformations (“optimally” in the sense of matching the visual perception of 
real and virtual objects). The Discrete Fourier Transform, Discrete Cosine Transform, Hadamard Transform, 
S-Transform, and Karhunen-Loeve Transform have been actively used in digital image processing since the advent of 
digital television. 

Materials and Methods. The scheme of the spectral transplantation method (the version using the Fourier 
transform [10]) is shown in Figure 1. Frames of the real world (world frame, WF) and of the virtual world (virtual 
frame, VF) are used as input data (Fig. 2). This is natural for “video” AR (when the real world is observed through a 
video camera). For “transparent” AR, when the real world is observed through transparent glasses, real images are 
captured using cameras located on the AR glasses. 


[Figure 1: block diagram. Source images → color channels extraction → direct 2-dimensional Fourier transform (DFT) → 
low-frequency part (LFP) transplantation → reverse 2-dimensional Fourier transform (RFT), e.g. RFT(DFT(VFr)) → 
corrected VF restoration and AR scene composing. The virtual world frame enters the chain and the corrected virtual 
world frame leaves it.] 

Fig. 1. Scheme of the spectral transplantation method. A version using the Fourier transform 


Fig. 2. Real world (WF) and virtual world (VF) video frames: 
a: WF, airport, cloudy weather; b: WF, airport, sunny weather; c: VF, virtual airplane. 
WF are small fragments (<25%) of images published on websites sydneyairport.com.au and 6sqft.com. 


The goal of this method is to transfer the main characteristics of the image from WF to VF. The scheme of the 
method is very simple, although the operations are computationally expensive. The method is implemented in five 
stages (Fig. 1): 

1) Selection of color (RGB) channels for WF — WFr, WFg, WFb and for VF — VFr, VFg, VFb. The RGB model 
is used because of its generality and correlation between channels, which is specific to this model. 

2) Calculation of the two-dimensional direct Fourier transform (DFT): DFT(WFr), 
DFT(WFg), DFT(WFb), DFT(VFr), DFT(VFg), DFT(VFb). The DFT formula is given below: 

$$X_c(k,l) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} x_c(m,n)\,\exp\left(-j2\pi\left(\frac{km}{M} + \frac{nl}{N}\right)\right), \qquad (1)$$

where $c = R, G, B$ is the index of the red, green and blue color channels of the image; $M$, $N$ are the numbers of rows 
and columns of the pixel matrix of the transformed image; $k$, $l$ are the spatial frequency arguments; $x_c(m,n)$ is the 
pixel value at spatial coordinates $(m,n)$ in channel $c$; $X_c(k,l)$ are complex numbers. 

3) This is the key stage. The low-frequency part (LFP) is transplanted between the pairs of WF and VF 
spectra for each of the red, green and blue channels: the VF LFP is replaced by the corresponding WF 
LFP. The idea of spectral transplantation is based on the following property of the DFT: the general character of the 
image (i.e., color hue, brightness, contrast) depends on the spatial frequencies contained in the LFP (including the 
constant component) of its two-dimensional spectrum. 

Thus, by transplanting the WF LFP into the VF spectrum, we transfer the main characteristics of the image from WF to 
VF. For this, it is more convenient to use the centered form of the two-dimensional spectrum, where the constant 
component is located in the center of the matrix of spectral coefficients, and the low-frequency components are 
arranged symmetrically around it. In a centered spectrum, the LFP is the central part of the DFT 
matrix and has the size M1 × N1 (M1 < M, N1 < N). If M1 = N1, the square LFP matrix can be denoted 
LFP(012...F), where 0 is the constant component and F is the number of the largest spatial frequency in the LFP matrix. 

The size of the LFP for transplantation depends on the size of the transformed image (this size determines the 
spectral resolution) and on the volume of image characteristics that should be borrowed from WF. At this stage of 
research, the size of LFP is determined empirically. For example, the best visual results for 512x512-pixel images 
were obtained using LFP (012345). 

4) Restoring the RGB channels of VF using the two-dimensional reverse Fourier transform (RFT). At this stage, 
the characteristics of WF and VF are mixed. As a result, RGB channels of the VF image are obtained with the main color, 
brightness and contrast characteristics of the WF, as well as with characteristics inherited from the original VF. 

5) Restoring the corrected VF color through merging the RGB channels obtained at the previous stage, cutting out 
virtual objects and building an AR scene by superimposing the cut virtual objects on WF. 
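
The five stages above can be sketched with NumPy's FFT routines. This is a minimal illustration under stated assumptions, not the author's implementation: LFP(012...F) is interpreted as the central (2F+1) × (2F+1) block of the centered spectrum, pixel values are assumed to lie in 0..255, and the object mask is assumed to be supplied by the renderer. The names `transplant_channel`, `spectral_transplant`, `compose_ar_scene`, `f_max` and `mask` are hypothetical.

```python
import numpy as np


def transplant_channel(vf, wf, f_max):
    """Replace the low-frequency part (LFP) of vf's spectrum with wf's."""
    if vf.shape != wf.shape:
        raise ValueError("VF and WF channels must have the same shape")
    cm, cn = vf.shape[0] // 2, vf.shape[1] // 2
    if f_max < 0 or cm + f_max >= vf.shape[0] or cn + f_max >= vf.shape[1]:
        raise ValueError("f_max does not fit inside the spectrum")
    # Stage 2: direct 2-D Fourier transform, shifted so that the constant
    # component sits at the center (cm, cn) of the coefficient matrix.
    vf_spec = np.fft.fftshift(np.fft.fft2(vf))
    wf_spec = np.fft.fftshift(np.fft.fft2(wf))
    # Stage 3: copy the central block (frequencies -f_max..f_max in both
    # directions) from the WF spectrum into the VF spectrum.
    rows = slice(cm - f_max, cm + f_max + 1)
    cols = slice(cn - f_max, cn + f_max + 1)
    vf_spec[rows, cols] = wf_spec[rows, cols]
    # Stage 4: reverse transform. The block is symmetric about the center,
    # so the imaginary part is only rounding noise and is dropped.
    return np.fft.ifft2(np.fft.ifftshift(vf_spec)).real


def spectral_transplant(vf_rgb, wf_rgb, f_max=5):
    """Stages 1-4 plus the channel merge of stage 5 for H x W x 3 arrays."""
    vf = np.asarray(vf_rgb, dtype=np.float64)
    wf = np.asarray(wf_rgb, dtype=np.float64)
    # Stage 1: split into R, G, B channels and process each independently.
    channels = [transplant_channel(vf[..., c], wf[..., c], f_max)
                for c in range(3)]
    # Clipping to 0..255 is an assumption about the pixel value range.
    return np.clip(np.stack(channels, axis=-1), 0.0, 255.0)


def compose_ar_scene(corrected_vf, wf_rgb, mask):
    """Stage 5: superimpose masked virtual objects on the real frame.

    mask is an H x W boolean array, True where a virtual object is drawn.
    """
    return np.where(mask[..., None], corrected_vf, np.asarray(wf_rgb))
```

With `f_max=5` the call corresponds to LFP(012345) under the interpretation above; `f_max=0` transplants only the constant component, which shifts the mean value of each VF channel to that of WF.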

Obviously, if this method is used to process a WF video stream, there is no need to calculate the DFT and, 
accordingly, the LFP for each frame of the real world, since the main characteristics of the image change only with a 
radical change in the recorded scene. Such changes can be easily detected by jumps in the average pixel value. At these 
moments, the spectral transformation for the LFP must be recalculated. 
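
A minimal sketch of this caching scheme for one color channel is given below. The generator name `wf_lfp_stream`, the default `threshold` of 10 gray levels and the block-shaped LFP are assumptions made for illustration; the source does not specify how large a jump in the average pixel value should trigger recalculation.

```python
import numpy as np


def wf_lfp_stream(frames, f_max=5, threshold=10.0):
    """Yield the centered LFP block to use for each single-channel WF frame.

    The DFT is recomputed only when the mean pixel value jumps by more
    than `threshold` (an assumed value) relative to the frame for which
    the cached LFP was computed.
    """
    cached, ref_mean = None, None
    for frame in frames:
        mean = float(np.mean(frame))
        if cached is None or abs(mean - ref_mean) > threshold:
            spec = np.fft.fftshift(np.fft.fft2(frame))
            cm, cn = frame.shape[0] // 2, frame.shape[1] // 2
            cached = spec[cm - f_max:cm + f_max + 1,
                          cn - f_max:cn + f_max + 1].copy()
            ref_mean = mean
        yield cached
```

Each yielded block can be written into the centered VF spectrum of the corresponding frame, so only the VF transforms remain to be computed per frame.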

Since various types of images are optimally described by different types of spectral transformations mentioned 
above, it is reasonable to develop an automatic algorithm for selecting the optimal type of transformation for use in 
spectral transplantation. 

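We propose to estimate the difference between the visual perception of VF and WF by the RMS distance $\Delta_c$ between 
the LFP power spectra of their images (for all color channels): 

$$\Delta_c = \sqrt{\frac{1}{L}\sum_{k=0}^{M_1-1}\sum_{l=0}^{N_1-1}\bigl(P_V(k,l) - P_W(k,l)\bigr)^2}, \quad c = R, G, B, \qquad (2)$$

where $P_V$ and $P_W$ are the two-dimensional power spectra of VF and WF, respectively, in channel $c$; the indices $k$, $l$ 
run over the M1 × N1 LFP matrix, and $L$ is the number of coefficients in the sum. For example, in the case of the 
Fourier transform, the formula for $P$ has the form: 

$$P_c(k,l) = \left|\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} x_c(m,n)\,\exp\left(-j2\pi\left(\frac{km}{M} + \frac{nl}{N}\right)\right)\right|^2. \qquad (3)$$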
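
The distance between LFP power spectra can be sketched for the Fourier case as follows. The helper names `lfp_power` and `rms_distance` are hypothetical, and the LFP is again assumed to be the central (2F+1) × (2F+1) block of the centered spectrum, with the mean taken over all coefficients of that block.

```python
import numpy as np


def lfp_power(channel, f_max):
    """Power spectrum |X(k,l)|^2 restricted to the central LFP block."""
    spec = np.fft.fftshift(np.fft.fft2(channel))
    cm, cn = channel.shape[0] // 2, channel.shape[1] // 2
    block = spec[cm - f_max:cm + f_max + 1, cn - f_max:cn + f_max + 1]
    return np.abs(block) ** 2


def rms_distance(vf_rgb, wf_rgb, f_max=5):
    """Return the vector of per-channel RMS distances (R, G, B)."""
    vf = np.asarray(vf_rgb, dtype=np.float64)
    wf = np.asarray(wf_rgb, dtype=np.float64)
    deltas = []
    for c in range(3):
        diff = lfp_power(vf[..., c], f_max) - lfp_power(wf[..., c], f_max)
        # Root of the mean squared difference over the LFP coefficients.
        deltas.append(np.sqrt(np.mean(diff ** 2)))
    return np.array(deltas)
```

For other transformations only `lfp_power` would change; the distance itself is computed in the same way.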


We propose to determine the optimal type of spectral transformation by the proximity of the vectors $\Delta_j$ to the 
mean vector, which is calculated by the criterion of the minimum sum of squared distances between the mean vector and 
the vectors $\Delta_j$ of all transformations under consideration. 

Let $\Delta_j(\Delta_{jR}, \Delta_{jG}, \Delta_{jB})$ be the normalized vector of the distances between the VF and WF LFP 
spectra for transformation $j$. Let $\Delta_a(\Delta_{aR}, \Delta_{aG}, \Delta_{aB})$ be the mean vector, and $D_j$ the 
distance between $\Delta_j$ and $\Delta_a$. Then the sum $S$ of the squared distances from the vectors $\Delta_j$ of all 
transformations under consideration to the mean vector is equal to: 

$$S = \sum_j D_j^2 = \sum_j \sum_c (\Delta_{jc} - \Delta_{ac})^2, \quad c = R, G, B. \qquad (4)$$

The coordinates $\Delta_{aR}$, $\Delta_{aG}$, $\Delta_{aB}$ of the mean vector are calculated as the solution to the 
system of equations obtained by setting the partial derivatives of $S$ to zero: 

$$\frac{\partial S}{\partial \Delta_{ac}} = 0, \quad c = R, G, B. \qquad (5)$$

The optimal type of spectral transformation is selected by the proximity condition: 

$$\min_j D_j. \qquad (6)$$
Another obvious criterion for selecting the optimal type of transformation is the length of the vectors $\Delta_j$. However, 
the extremes of such a criterion may be related to the ability or inability of certain transformations to correctly detect 
the difference between certain types of WF and VF. Therefore, we consider the use of the mean vector as a more 
reliable method of selection. 

Similar to the DFT calculation for WF, the optimal type of transformation is selected only once at the beginning of 
spectral transplantation, unless WF is changed dramatically. 
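
The selection rule can be sketched as follows. Minimizing the sum of squared distances over the mean vector gives the arithmetic mean of the distance vectors in each coordinate, so the code computes that mean and returns the transformation whose vector lies closest to it. The function name `select_transform` and the sample values in the usage note are illustrative assumptions; the source does not specify how the vectors are normalized, so that step is left to the caller.

```python
import numpy as np


def select_transform(deltas):
    """Pick the transformation whose distance vector is nearest the mean.

    deltas maps a transformation name to its (R, G, B) distance vector.
    Returns the selected name and the mean vector.
    """
    names = list(deltas)
    vectors = np.array([deltas[name] for name in names], dtype=np.float64)
    # Setting the partial derivatives of the sum of squared distances to
    # zero yields the coordinate-wise arithmetic mean.
    mean_vec = vectors.mean(axis=0)
    # Euclidean distance of each vector to the mean vector.
    dists = np.linalg.norm(vectors - mean_vec, axis=1)
    return names[int(np.argmin(dists))], mean_vec
```

For example, with made-up vectors `{"fourier": [1, 1, 1], "cosine": [2, 2, 2], "hadamard": [6, 6, 6]}` the mean vector is (3, 3, 3) and the function selects `"cosine"`.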

Research Results. The proposed method was tested using the Fourier transform without selecting the optimal type 
of transformation. WF (real airport scene) and VF (virtual airplane model) had a size of 512×512 pixels and 24-bit 
color. Two different conditions were investigated: 

1) WF — photo of the airport in cloudy weather (Fig. 2 a); 

2) WF — photo of the airport in sunny weather (Fig. 2 b). 

In both cases, VF contained a 3D model of the aircraft shown in Figure 2 c. LFP(0), LFP(01), LFP(012), LFP(0123), 
LFP(01234), LFP(012345) transplants were tested. Some of the test results are shown in Figure 3. The best visual 
results were obtained using LFP(012345). In Figure 3, the images after spectral transplantation are intentionally shown 
without other VC effects (shadows, lighting, etc.) to demonstrate the pure results of this method. 


Fig. 3. AR-scene: a: composed of WF and VF without LFP transplantation; b: AR-scene after LFP(0123) 
transplantation; c: AR-scene after LFP(012345) transplantation 


The upper and lower rows in Figure 3 correspond to the opposite conditions for WF: light and dark WF with 
different shades. Experiments with any other WF will not add significantly new information since they will have 
conditions between those already presented in Figure 3. 

Numerical simulation was carried out to demonstrate the mechanism of spectral transplantation. Figure 4 shows 
Fourier transplantation using a small (8×8) pixel matrix representing one of the color channels of WF and VF. Such a 
small matrix makes it possible to illustrate the transplantation procedure clearly. In this example, the WF matrix can 
be associated with an image with a vertical gradient fill, and the VF matrix with an image with a horizontal gradient 
fill. Another difference between WF and VF is the range of pixel values: 8-15 for WF (“lighter image”, 8 is the constant 
component) and 0-7 for VF (“darker image”). LFP(01') transplantation is shown, where 1' means part of the first spatial 
frequency component (used because of the very low resolution of the 8×8 matrix). The 3D view of the VF matrix 
after transplantation indicates the transfer of properties from WF to VF: the edge of the surface has risen, and the first 
pixel has received the value of the constant WF component. This example demonstrates how, as a result of spectral 
transplantation, VF starts to acquire a vertical gradient and a constant component. 


[Figure 4: top row, 3D views of the WF pixel matrix, the VF pixel matrix, and the VF matrix with LFP(01'); bottom row, 
the two-dimensional WF spectrum, the two-dimensional VF spectrum, and the VF spectrum with LFP(01') after 
transplantation.] 

Fig. 4. Numerical simulation of spectral transplantation for 8×8-pixel matrices 
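
A comparable 8×8 experiment can be reproduced with the short sketch below. It is an illustration rather than the computation behind Figure 4: it transplants the constant component alone, LFP(0), and the full first frequency, LFP(01), instead of the partial LFP(01'), and it assumes that the LFP is the central block of the centered spectrum. The function name `transplant` is hypothetical.

```python
import numpy as np


def transplant(vf, wf, f_max):
    """Copy frequencies -f_max..f_max of wf's centered spectrum into vf's."""
    vf_spec = np.fft.fftshift(np.fft.fft2(vf))
    wf_spec = np.fft.fftshift(np.fft.fft2(wf))
    cm, cn = vf.shape[0] // 2, vf.shape[1] // 2
    rows = slice(cm - f_max, cm + f_max + 1)
    cols = slice(cn - f_max, cn + f_max + 1)
    vf_spec[rows, cols] = wf_spec[rows, cols]
    return np.fft.ifft2(np.fft.ifftshift(vf_spec)).real


# WF: vertical gradient with values 8..15 (each row is constant).
wf = np.repeat(np.arange(8.0, 16.0)[:, None], 8, axis=1)
# VF: horizontal gradient with values 0..7 (each column is constant).
vf = np.repeat(np.arange(0.0, 8.0)[None, :], 8, axis=0)

after_lfp0 = transplant(vf, wf, 0)    # constant component only
after_lfp01 = transplant(vf, wf, 1)   # constant component + first frequency

# Row means expose the vertical structure borrowed from WF.
row_means = after_lfp01.mean(axis=1)
```

With LFP(0) every VF pixel is raised by the difference of the mean values, 11.5 - 3.5 = 8, so the first pixel takes the value 8. With LFP(01) the row means are no longer constant, which is the vertical gradient that VF starts to acquire.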


Spectral transplantation provides several options for changing the parameters of this procedure: changing LFP size; 
selecting individual components of the spectrum for transplantation; using different transplant coefficients for various 
components to be transplanted. 
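
One possible reading of “transplant coefficients” is a blending weight applied to the transplanted components. The sketch below follows that reading, which is an assumption and not a definition taken from the source; the name `weighted_transplant` and the parameter `weight` are hypothetical.

```python
import numpy as np


def weighted_transplant(vf, wf, f_max, weight=1.0):
    """Blend the LFP of vf with that of wf instead of replacing it.

    weight = 1 gives a full transplantation and weight = 0 leaves vf
    unchanged. weight may also be an array matching the LFP block; it
    should then be symmetric about the block center so that the result
    stays real.
    """
    vf_spec = np.fft.fftshift(np.fft.fft2(vf))
    wf_spec = np.fft.fftshift(np.fft.fft2(wf))
    cm, cn = vf.shape[0] // 2, vf.shape[1] // 2
    rows = slice(cm - f_max, cm + f_max + 1)
    cols = slice(cn - f_max, cn + f_max + 1)
    vf_spec[rows, cols] = ((1.0 - weight) * vf_spec[rows, cols]
                           + weight * wf_spec[rows, cols])
    return np.fft.ifft2(np.fft.ifftshift(vf_spec)).real
```

Changing `f_max` varies the LFP size, while an array-valued `weight` with zeros in some positions selects individual components for transplantation.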

Figure 5 shows the effect of transplantation with different parameters for various types of virtual objects: virtual 
aircraft models that differ in surface texture, markings, and gloss. Figures 5 a, b and c depict a virtual airplane with 
complex textures, text symbols and reflections of virtual light sources. Figures 5 d, e and f show a virtual plane with 
simple contrasting colors. Parts a and d contain virtual objects without transplantation; b and e contain virtual objects 
after LFP(0123) transplantation; c and f contain virtual objects after LFP(012345) transplantation. Virtual objects are 
intentionally shown without other VC effects (shadows, lighting, etc.) to demonstrate the pure results of the method. 

Fig. 5. Scenes with cloudy WF: a, d show AR-scenes consisting of WF and VF without LFP transplantation; 
b, e show AR-scenes composed after LFP(0123) transplantation; c, f show AR-scenes composed after LFP(012345) transplantation 


It is important to emphasize that the presented figures illustrate the possibilities of tuning the proposed method, 
and not the final result, since it requires tuning to specific WF. The demonstration of well-executed, but incomplete 
results, as is often practiced in VC works, does not seem correct to us. 

Discussion and Conclusions. The key complicating factor for the described method, presented in Figure 1, is the 
high computational costs. The most promising way to solve this problem is to directly convert WF LFP parameters 
into VF rendering parameters. This eliminates the cumbersome procedures of three DFT and three RFT calculations 
at the second and fourth processing steps, and requires only three WF DFT calculations, once for each section of the 
WF flow without significantly changing the basic characteristics of WF. This approach provides processing the VF 
flow in real time. 

Another problem is selecting the optimal LFP size. As the volume of spatial frequencies used increases, they begin 
to hold information about the WF contents. Therefore, limiting the size of LFP is needed to eliminate the effect of a 
hybrid image [11]. The complexity of the optimal selection is conditioned by its association with both the LFP size 
and the nature of the image. Recent advances in deep learning suggest that a new approach to visual coherence 
through spectral transplantation could be the use of generative adversarial networks (GANs) to transfer realistic lighting 
information from the source image to the target image in the same way that GANs transfer image style. In 
particular, it would be interesting to compare the performance of GANs on data sets consisting of either RGB 
images or images represented in the frequency domain using DFT. We believe that the latter approach will help to 
select the optimal LFP. GANs are already widely used in VC studies [2], as are neural networks in general [12]. 

In further research related to the topic of this paper, the following issues will be considered: 

— automatic determination of the optimal LFP size for transplantation with a given volume of characteristics 
borrowed from WF; 

— automatic detection of the exact moments when a new WF LFP must be calculated for transplantation when 
processing WF video in real time (as mentioned above, this must be done if the basic characteristics of WF change 
radically); 

— using the same approach in reverse (from virtual to real) to apply virtual lighting to real scenes (how virtual 
lighting affects the environment). 

As a fully automatic process that does not require measuring illumination, the proposed spectral transplantation method 
solves a number of complex VC problems, for example, how best to align the color, brightness, and contrast 
characteristics of real and virtual components in AR scenes. All these tasks are solved through one simple procedure 
without modeling lighting conditions, AR-scene geometry or BRDF, which eliminates the inevitable modeling errors. The 
proposed method can be a valuable addition to other VC tools. 


Gorbunov AL. Visual Coherence for Augmented Reality 


References 

1. Hughes CL, Fidopiastis C, Stanney KM, et al. The Psychometrics of Cybersickness in Augmented Reality. 
Frontiers in Virtual Reality. 2020;1:602954. https://doi.org/10.3389/frvir.2020.602954 

2. Somanath G, Kurz D. HDR Environment Map Estimation for Real-Time Augmented Reality. Cupertino, CA: 
Apple Inc.; 2020. URL: https://arxiv.org/pdf/2011.10687.pdf (accessed: 17.06.2022). 

3. Kronander J, Banterle F, Gardner A, et al. Photorealistic Rendering of Mixed Reality Scenes. Computer 
Graphics Forum. 2015;34(2):643—665. https://doi.org/10.1111/cgf.12591 (accessed: 17.06.2022). 

4. Debevec P, Graham P, Busch J, et al. A Single-Shot Light Probe. SIGGRAPH ’12: ACM SIGGRAPH 2012 Talks. 
2012;10:1—19. URL: https://vgl.ict.usc.edu/Research/SSLP/A_Single Shot Light Probe-SIGGRAPH2012.pdf 
(accessed: 17.06.2022). 

5. Alhakamy A, Tuceryan M. CubeMap360: Interactive Global Illumination for Augmented Reality in Dynamic 
Environment. In: Proc. IEEE SoutheastCon. Huntsville, AL: IEEE; 2019. 
https://doi.org/10.1109/SoutheastCon42311.2019.9020588 

6. Knorr SB, Kurz D. Real-Time Illumination Estimation from Faces for Coherent Rendering. In: Proc. IEEE Int. 
Symposium on Mixed and Augmented Reality (ISMAR). Munich: IEEE; 2014. P. 113-122. 
https://doi.org/10.1109/ISMAR.2014.6948483 

7. Seiji Tsunezaki, Ryota Nomura, Takashi Komuro, et al. Reproducing Material Appearance of Real Objects 
using Mobile Augmented Reality. In: Proc. 2018 IEEE International Symposium on Mixed and Augmented Reality 
Adjunct (ISMAR-Adjunct). Munich: IEEE; 2018. P. 196-197. https://doi.org/10.1109/ISMAR-Adjunct.2018.00065 

8. Reinhard E, Akyuz AO, Colbert M, et al. Real-Time Color Blending of Rendered and Captured Video. In: Proc. 
Interservice/Industry Training, Simulation and Education Conference (I/ITSEC). Orlando, Florida: National Training 
and Simulation Association; 2004. P. 1-9. URL: https://user.ceng.metu.edu.tr/~akyuz/files/blend.pdf (accessed: 
17.06.2022). 

9. Xuezhong Xiao, Lizhuang Ma. Gradient-Preserving Color Transfer. Computer Graphics Forum. 
2009;28(7):1879-1886. https://doi.org/10.1111/j.1467-8659.2009.01566.x 

10.Gorbunov AL, et al. Sposob formirovaniya izobrazheniya dopolnennoi real'nosti, obespechivayushchii 
sovpadenie vizual'nykh kharakteristik real'nykh i virtual'nykh ob"ektov. RF Patent No. 2667602. 2019. (In Russ.) 

11.Oliva A, Torralba AJ, Schyns PhG. Hybrid Images. ACM Transactions on Graphics. 2006;25(3):527-532. 
https://doi.org/10.1145/1179352.1141919 

12.Kan P, Kaufmann H. DeepLight: Light Source Estimation for Augmented Reality Using Deep Learning. The 
Visual Computer. 2019;35:873-883. https://doi.org/10.1007/s00371-019-01666-x 


Received 10.04.2023 
Revised 18.04.2023 
Accepted 18.04.2023 


About the Author: 

Andrey L. Gorbunov, Cand.Sci. (Eng.), Associate Professor, Air Traffic Control Department, Moscow State 
Technical University of Civil Aviation (20, Kronshtadtskii blvd, Moscow, 125993, RF), Director-General, “Aviareal” 
LLC (5, Zagoryevskaya St., Moscow, 115372, RF), a-gorbunov@mail.ru 


Conflict of interest statement: the author does not have any conflict of interest. 
The author has read and approved the final manuscript. 


Information Technology, Computer Science and Management 



http://vestnik-donstu.ru 



Advanced Engineering Research (Rostov-on-Don). 2023;23(2):180-190. eISSN 2687-1653 
